Abstract. We employ the methods of statistical physics to study the performance of
Gallager type error-correcting codes. In this approach, the transmitted codeword
comprises Boolean sums of the original message bits selected by two randomly-constructed
sparse matrices. We show that a broad range of these codes potentially saturate
Shannon's bound but are limited due to the decoding dynamics used. Other codes show
sub-optimal performance but are not restricted by the decoding dynamics. We show
how these codes may also be employed as a practical public-key cryptosystem and
offer performance competitive with modern cryptographic methods.



Error-correcting codes are of significant practical importance as they provide
mechanisms for retrieving the original message after possible corruption due to
noise during transmission. They are used extensively in most means of information
transmission, from satellite communication to the storage of information on
hardware devices. The coding efficiency, measured in the fraction of informative 
transmitted bits, plays a crucial role in determining the speed of communication 
channels and the effective storage space on hard-disks. Rigorous bounds [1] have 
been derived for the maximal channel capacity for which codes, capable of achieving 
arbitrarily small error probability, can be found. However, most existing practical 
error-correcting codes are significantly far from saturating this bound and the quest 
for more efficient error-correcting codes has been going on ever since. 

One family of codes, introduced originally by Gallager [2], and abandoned in 
favour of other codes due to the limited computing facilities of the time, has
recently been re-introduced [3], showing excellent performance with respect to most
existing codes. In fact, it has recently been discovered that irregular constructions 
of Gallager's code result in better performance than any other method [4,5] and 
nearly saturate Shannon's bound for infinite message size. Gallager-type methods 
are generally based on the introduction of random sparse matrices for generating 
the transmitted codeword as well as for decoding the received corrupted codeword. 
Various decoding methods have been successfully employed; we will mainly focus 
here on the leading technique of Belief Propagation (BP) [6]. 



Most studies of Gallager-type codes conducted so far have been carried out via 
numerical simulations. Some analytical results have been obtained via methods of 
information theory [3], setting bounds on the performance of certain code types, and 
by combinatorial/statistical methods [4]. Here we analyze their typical performance 
for several parameter choices via the methods of statistical physics, and validate 
the analytical solutions against results obtained by the Thouless-Anderson-Palmer
(TAP) approach [7] to diluted systems and via numerical methods. 

In a general scenario, the N dimensional Boolean message $\xi$ is encoded to the
M dimensional vector $J^0$, which is then transmitted through a noisy channel with
flipping probability p per bit (other noise types may also be considered). The 
received message J is decoded to retrieve the original message. 

One can identify several slightly different versions of Gallager-type codes. The 
one used here, termed the MN code [3], is based on choosing two randomly-selected
sparse matrices A and B of dimensionality MxN and MxM respectively; these are 
characterized by K and L non-zero unit elements per row and C and L per column 
respectively. The finite, usually small, numbers K, C and L define a particular 
code; both matrices are known to both sender and receiver. Encoding is carried 
out by constructing the modulo 2 inverse of B and the matrix $B^{-1}A$ (modulo 2);
the vector $J^0 = B^{-1}A\xi$ (modulo 2, $\xi$ in a Boolean representation) constitutes the
codeword. Decoding is carried out by taking the product of the matrix B and the
received message $J = J^0 + \zeta$ (modulo 2), where $\zeta$ is the Boolean noise vector.
The equation

$A\xi + B\zeta = AS + B\tau$ (mod 2), (1)

is solved via the iterative methods of BP [3] to obtain the most probable Boolean
vectors $S$ and $\tau$; BP methods in this context have recently been shown to be
identical to a TAP based solution of a similar physical system [8].
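The encoding step and the left-hand side of Eq. (1) can be illustrated with a small sketch. It is not the authors' construction: the sizes, the seed, the row-only sparsity of A and the lower-bidiagonal choice of B are assumptions made so that B is guaranteed to be invertible modulo 2 (a strictly regular B with an even number of ones in every column has rows that sum to zero modulo 2 and cannot be inverted). The BP decoding step itself is not shown.

```python
import random

def matvec_mod2(M, v):
    """Matrix-vector product over GF(2); M is a list of 0/1 rows."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]

def add_mod2(u, v):
    """Element-wise sum modulo 2."""
    return [a ^ b for a, b in zip(u, v)]

def sparse_rows(n_rows, n_cols, k, rng):
    """Rows with exactly k ones each (column weights are NOT controlled here)."""
    rows = []
    for _ in range(n_rows):
        row = [0] * n_cols
        for j in rng.sample(range(n_cols), k):
            row[j] = 1
        rows.append(row)
    return rows

def lower_bidiagonal(m):
    """Assumed form of B: ones on the diagonal and the sub-diagonal.
    Unit lower-triangular, hence invertible modulo 2."""
    return [[1 if j in (i, i - 1) else 0 for j in range(m)] for i in range(m)]

def solve_lower_mod2(B, rhs):
    """Solve B x = rhs (mod 2) by forward substitution (B unit lower-triangular)."""
    x = []
    for i, row in enumerate(B):
        acc = rhs[i]
        for j in range(i):
            acc ^= row[j] & x[j]
        x.append(acc)
    return x

rng = random.Random(0)                 # illustrative seed
N, M, K, p = 8, 16, 3, 0.1             # illustrative sizes and flip probability
A = sparse_rows(M, N, K, rng)
B = lower_bidiagonal(M)
xi = [rng.randint(0, 1) for _ in range(N)]               # message
zeta = [1 if rng.random() < p else 0 for _ in range(M)]  # channel noise

J0 = solve_lower_mod2(B, matvec_mod2(A, xi))  # codeword J0 = B^{-1} A xi (mod 2)
J = add_mod2(J0, zeta)                        # received message
lhs = matvec_mod2(B, J)                       # what the decoder computes
rhs = add_mod2(matvec_mod2(A, xi), matvec_mod2(B, zeta))  # A xi + B zeta
print(lhs == rhs)
```

Solving B x = A xi by forward substitution avoids forming the dense matrix $B^{-1}A$ explicitly; this shortcut works only because of the triangular form assumed for B.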

The similarity between error-correcting codes of this type and Ising spin systems 
was first pointed out by Sourlas [9], who formulated the mapping of a simpler 
code onto an Ising spin system Hamiltonian. To facilitate the current investigation 
we first map the problem onto that of an Ising model with finite connectivity. 
We employ the binary representation (±1) of the dynamical variables $S$ and $\tau$
and of the vectors $J$ and $J^0$ rather than the Boolean (0, 1) one; the vector $J^0$ is
generated by taking products of the relevant binary message bits,
$J^0_\mu = \prod_{i \in \mu} \xi_i$, where the indices $i \in \mu$
correspond to the non-zero elements of $B^{-1}A$, producing a binary version of $J^0$.
As we use statistical mechanics techniques, we consider the message and codeword
dimensionality (N and M respectively) to be infinite, keeping the ratio R = N/M,
which constitutes the code rate, finite. Using the thermodynamic limit is natural
as Gallager-type codes are used to transmit long (10^ — 10^) messages, where finite 
size corrections are likely to be negligible. We examine the Hamiltonian 
The tensor product $\mathcal{D}_{\mu\sigma} J_{\mu\sigma}$, where
$J_{\mu\sigma} = \prod_{i \in \mu} \xi_i \prod_{j \in \sigma} \zeta_j$ and $\mu$, $\sigma$
denote the sets of signal and noise indices, is the binary equivalent of
$A\xi + B\zeta$, treating both signal ($S$ and index $i$) and noise
($\tau$ and index $j$) simultaneously. Elements of the sparse connectivity tensor
$\mathcal{D}_{\mu\sigma}$ take the value 1 if the corresponding indices of both signal
and noise are chosen (i.e., all corresponding indices of A and B are 1) and 0
otherwise; it has C unit elements per i-index and L per j-index, representing the
system's degree of connectivity. The $\delta$ function provides 1 if the selected sites'
product $\prod_{i \in \mu} S_i \prod_{j \in \sigma} \tau_j$ is in disagreement with
$J_{\mu\sigma}$, recording an error, and 0 otherwise. Note that this term is not
frustrated, as there are M + N degrees of freedom and M constraints, Eq. (1). The
two terms on the right represent our prior knowledge in the case of biased
messages ($F_s$) and of the noise level ($F_\tau$), and require assigning certain values to
these additive fields. The choice of $\beta \to \infty$ imposes the restriction of Eq. (1), while
the last two terms remain finite. Note that the noise dynamical variables $\tau$ are
irrelevant to measuring the retrieval success
$m = \frac{1}{N} \langle \sum_{i=1}^{N} \xi_i \, \mathrm{sign} \langle S_i \rangle_\beta \rangle_\xi$.
The latter monitors the normalized mean overlap between the Bayes-optimal retrieved
message, corresponding to the alignment of $\langle S_i \rangle_\beta$ to the nearest binary
value [9], and the original message; the subscript $\beta$ denotes thermal averaging. The
selection of elements in $\mathcal{D}$ introduces disorder to the system; we calculate the
partition function $Z(\mathcal{D}, J) = \mathrm{Tr}_{\{S,\tau\}} \exp[-\beta H]$ averaged over
$\mathcal{D}$, $\xi$ and $\zeta$ using the replica
method [8]. We employ the replica symmetry ansatz to obtain a set of saddle point
equations with respect to the emerging continuous order parameters, representing
local field probability distributions and the respective conjugate distributions [10].
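The passage from the Boolean (0, 1) to the binary (±1) representation, and the overlap used to measure retrieval, can be made concrete with a short sketch. The bit pattern and the thermal averages below are invented for illustration; the overlap is evaluated for a single message, without the outer average over messages.

```python
from math import prod

def to_spin(bits):
    """Map Boolean 0/1 values to binary +1/-1 values via x -> (-1)**x."""
    return [1 - 2 * b for b in bits]

def overlap(xi_spin, mean_S):
    """m = (1/N) * sum_i xi_i * sign(<S_i>); sign(0) is taken as 0 here."""
    def sign(x):
        return (x > 0) - (x < 0)
    return sum(x * sign(s) for x, s in zip(xi_spin, mean_S)) / len(xi_spin)

bits = [1, 0, 1, 1]                      # illustrative Boolean message bits
spins = to_spin(bits)                    # their binary counterparts
parity = sum(bits) % 2                   # Boolean sum modulo 2
# A modulo-2 sum of bits becomes a product of the corresponding spins.
assert prod(spins) == 1 - 2 * parity

mean_S = [-0.9, 0.4, 0.2, -0.7]          # hypothetical thermal averages <S_i>
m = overlap(spins, mean_S)
print(m)
```

In this toy case three of the four signs agree with the message, so the overlap is below its perfect-retrieval value of 1.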

For unbiased messages and either K>3 (L>2) or L>3 (K>2) we obtain both
the ferromagnetic and paramagnetic solutions either by applying the TAP approach
or by solving the saddle point equations numerically. The former was carried out
at the values of $F_\tau$ and $F_s$ which correspond to the true noise and input bias
levels (for unbiased messages $F_s = 0$) and thus to Nishimori's condition [11]. This
is equivalent to having the correct prior within the Bayesian framework [9]. 

The most interesting quantity to examine is the maximal code rate, for a given 
corruption process, for which messages can be perfectly retrieved. This is defined 
in the case of K, L > 3 by the value of R = K/C = N/M for which the free
energy of the ferromagnetic solution becomes smaller than that of the paramagnetic
solution, constituting a first order phase transition. The critical code rate obtained,
$R_c = 1 - H_2(p) = 1 + (p \log_2 p + (1-p) \log_2 (1-p))$, coincides with Shannon's capacity.
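As a quick numerical illustration of the critical rate $R_c = 1 - H_2(p)$, the following sketch evaluates it for a few flip probabilities. The function names and the sample values of p are illustrative choices, not taken from the paper.

```python
from math import log2

def binary_entropy(p):
    """H_2(p) = -p log2 p - (1 - p) log2 (1 - p), with H_2(0) = H_2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def critical_rate(p):
    """Critical code rate R_c = 1 - H_2(p) for flip probability p."""
    return 1.0 - binary_entropy(p)

for p in (0.01, 0.05, 0.11, 0.5):        # illustrative flip probabilities
    print(f"p = {p:4.2f}   R_c = {critical_rate(p):.4f}")
```

A flip probability close to 0.11 corresponds to a critical rate of about one half, while p = 0.5 leaves no usable rate at all.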

The MN code for K, L > 3 seems to offer optimal performance. However, the
main drawback is rooted in the co-existence of the stable m = 1 and m = 0 solutions,
which implies that from most initial conditions the system will converge to the
undesired paramagnetic solution. Studying the ferromagnetic solution numerically shows a
highly limited basin of attraction, which becomes smaller as K and L increase,
while the paramagnetic solution at m = 0 always enjoys a wide basin of attraction.

Studying the case of K = L = 2 indicates the existence of paramagnetic and
ferromagnetic solutions, depicted in the inset of Fig. 1. For corruption probabilities
$p > p_s$ one obtains either a dominant paramagnetic solution or a mixture of
ferromagnetic (m = ±1) and paramagnetic (m = 0) solutions. Reliable decoding may
only be obtained for $p < p_s$, which corresponds to a spinodal point, where a unique
ferromagnetic solution emerges at m = 1 (plus a mirror solution at m = -1). Initial
conditions for both simulations (TAP/BP) and the numerical solutions were chosen
almost randomly, with a slight bias of 0(1O~^^) in the initial magnetization. The
results obtained point to the existence of a unique pair of global solutions to which
the system converges (below $p_s$) from all initial conditions.

FIGURE 1. Critical transmission rate as a function of p, obtained analytically (O)
and via BP (+) iterative solutions (N = 10*) for unbiased messages (averaged over
10 different initial conditions); error bars are smaller than the symbol size.
Shannon's bound (solid line) is shown for reference. Inset: The ferromagnetic (F)
and paramagnetic (P) solutions as functions of p; thick and thin lines denote
stable solutions of lower and higher free energies respectively, dashed lines
correspond to unstable solutions. Lines between the m = ±1 and m = 0 axes
correspond to sub-optimal ferromagnetic solutions.

The main question that emerges is the possibility of producing more complicated 
constructions for which the spinodal point is closer to Shannon's critical flip rate. 
Attempts so far have been based mainly on the introduction of irregular constructions [4,5],
where the number of unit elements per row/column in the matrices A and B is not
fixed. Analytical investigations [12] aimed at optimising the construction are yet to
provide a principled method for carrying out the optimisation.

The study of parity check codes and the insight gained from the analysis led us 
to suggest the potential use of a similar system as a public-key cryptosystem [13]. 

Public-key cryptography plays an important role in many aspects of modern
information transmission, for instance, in the areas of electronic commerce and
internet-based communication. It is based on the distribution of a public key
which may be used to encrypt messages in a manner that can only be decrypted, on
practical time scales, by the service provider. Several quite safe and efficient
cryptosystems are currently in use, such as RSA, elliptic curves and the McEliece
cryptosystem [15], most of which are based on number theory methods.

In the suggested cryptosystem, a plaintext represented by an N dimensional
Boolean vector $\xi \in \{0,1\}^N$ is encrypted to the M dimensional Boolean ciphertext J
using a predetermined Boolean matrix G, of dimensionality MxN, and a corrupting
M dimensional vector $\zeta$, whose elements are 1 with probability p and 0 otherwise,
in the following manner: $J = G\xi + \zeta$, where all operations are (mod 2). The
matrix G and the probability p constitute the public key; the corrupting vector
$\zeta$ is chosen at the transmitting end. The matrix G, which is at the heart of the
encryption/decryption process, is constructed by choosing two randomly-selected
sparse matrices A (MxN) and B (MxM), and a dense matrix D (NxN), defining
$G = B^{-1}AD$ (mod 2). The matrices A and B are generally characterised by K and
L non-zero unit elements per row and C and L per column respectively; all other
elements are set to zero. The finite, usually small, numbers K, C and L define a
particular cryptosystem. The dense invertible Boolean matrix D is arbitrary and is
added for improving the system's security. It may be constructed as D = TP, where
T and P are NxN triangular and random permutation matrices respectively, for
minimising the computational costs. The matrices A, B and D are known only to the
authorised receiver. Suitable choices of probability p will depend on the maximal
achievable rate for the particular cryptosystem, as discussed below.

The authorised user may decrypt the ciphertext J by taking the (mod 2) product
$BJ = A(D\xi) + B\zeta$, and finding the most probable solution to Eq. (1) using the
methods of BP; the estimate of $\xi$ is then obtained by taking the product of
the $(D\xi)$ estimate and $D^{-1}$. Studying the case of K = L = 2 and $p < p_s$ we
learned that iterative BP decoding converges to the ferromagnetic solution from
any initial conditions. Cryptosystems with other K, L values generally suffer from
a shrinking basin of attraction as K and L increase, although specific
matrices with higher K and L values (such as in [5]) may still be used successfully.
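The key construction, encryption and the algebra behind the first decryption step can be sketched as follows. This is a toy illustration under stated assumptions: the sizes and seed are arbitrary, A is sparse per row only, B is taken lower-bidiagonal so that it is invertible modulo 2, and T is a random unit lower-triangular matrix. BP decoding is not implemented; the sketch only checks the identity $BJ = A(D\xi) + B\zeta$ and the final multiplication by $D^{-1}$, using the true value of $D\xi$ in place of a BP estimate.

```python
import random

def matmul_mod2(X, Y):
    """Matrix product over GF(2)."""
    cols = list(zip(*Y))
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in cols] for row in X]

def matvec_mod2(X, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in X]

def inverse_mod2(X):
    """Gauss-Jordan inverse over GF(2); raises ValueError if X is singular."""
    n = len(X)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(X)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo 2")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

rng = random.Random(1)                  # illustrative seed
N, M, K, p = 6, 12, 2, 0.1              # illustrative sizes and flip probability

# Private key (assumed forms): sparse A, invertible B, dense D = T P.
A = []
for _ in range(M):
    row = [0] * N
    for j in rng.sample(range(N), K):
        row[j] = 1
    A.append(row)
B = [[1 if j in (i, i - 1) else 0 for j in range(M)] for i in range(M)]
T = [[1 if i == j else (rng.randint(0, 1) if j < i else 0) for j in range(N)]
     for i in range(N)]
perm = rng.sample(range(N), N)
P = [[int(j == perm[i]) for j in range(N)] for i in range(N)]
D = matmul_mod2(T, P)

# Public key: G (together with p).
G = matmul_mod2(matmul_mod2(inverse_mod2(B), A), D)

# Encryption at the transmitting end: J = G xi + zeta (mod 2).
xi = [rng.randint(0, 1) for _ in range(N)]
zeta = [int(rng.random() < p) for _ in range(M)]
J = [a ^ b for a, b in zip(matvec_mod2(G, xi), zeta)]

# First decryption step: B J should equal A (D xi) + B zeta (mod 2).
D_xi = matvec_mod2(D, xi)
lhs = matvec_mod2(B, J)
rhs = [a ^ b for a, b in zip(matvec_mod2(A, D_xi), matvec_mod2(B, zeta))]

# Final step, assuming BP has returned D xi exactly: multiply by D^{-1}.
recovered = matvec_mod2(inverse_mod2(D), D_xi)
print(lhs == rhs, recovered == xi)
```

Writing D as a product of a triangular and a permutation matrix guarantees invertibility by construction, so the Gauss-Jordan routine never hits the singular branch for D in this sketch.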

The cryptosystem offers a guaranteed convergence to the plaintext solution, in
the thermodynamic limit $N \to \infty$, as long as $p < p_s$. The main consequence of
finite plaintexts would be a decrease in the allowed corruption rate. Experimental
results with system sizes as small as N = 1000 still show good performance.

The unauthorised receiver, on the other hand, faces the task of decrypting the
ciphertext J knowing only G and p. The straightforward attempt to try all possible
$\zeta$ constructions is clearly doomed, provided that p is not so small that it gives
rise to only a few corrupted bits. We study the problem by exploiting the similarity
between the task at hand and the error-correcting model suggested by Sourlas [9].
In this case, the matrix G generated in the case of K = L = 2 is dense and has
a certain distribution of unit elements per row. The fraction of rows with a low
(finite, not of O(N)) number of unit elements vanishes as N increases, allowing one
to approximate this scenario by the diluted random energy model studied in [8].
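The infeasibility of trying all corrupting vectors can be made concrete by counting them. A sketch, assuming for illustration that exactly round(pM) bits are flipped and using arbitrary values of M and p: the number of candidates is the binomial coefficient C(M, pM), which grows roughly as $2^{M H_2(p)}$.

```python
from math import comb, log2

def binary_entropy(p):
    """H_2(p) in bits, with H_2(0) = H_2(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def log2_candidates(M, p):
    """log2 of the number of length-M Boolean vectors with round(p * M) ones."""
    return log2(comb(M, round(p * M)))

p = 0.1                                   # illustrative corruption probability
for M in (100, 1000, 2000):               # illustrative ciphertext lengths
    exact = log2_candidates(M, p)
    bound = M * binary_entropy(p)
    print(f"M = {M:5d}   log2 C(M, pM) = {exact:8.1f}   M * H_2(p) = {bound:8.1f}")
```

The exponent grows linearly with M, so the search space already exceeds any practical enumeration at the modest lengths used here.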

To investigate the typical properties of this (frustrated) model, we calculate
again the partition function and the free energy by averaging over the randomness
in choosing the plaintext, the corrupting vector and the choice of the random
matrix G (being generated by a product of two sparse random matrices). To assess
the likelihood of obtaining spin-glass/ferromagnetic solutions, we calculated the
free energy landscape (per plaintext bit, $f$) as a function of overlap m. This can
be carried out straightforwardly using the analysis of [10], and provides the
golf-course-like energy landscape with a relatively flat area around the one-step replica
symmetry breaking (frozen) spin-glass solution and a very deep but extremely
narrow area, of O(1/N), around the ferromagnetic solution [13].



It is worthwhile mentioning that this free-energy landscape may be related
directly to the marginal posterior $P(S_i = 1 | J)$, $1 \le i \le N$, and is therefore indicative
of the difficulties in obtaining ferromagnetic solutions when the starting point for
the search is not infinitesimally close to the original plaintext (which is clearly
highly unlikely). Numerical studies of similar energy landscapes show that the
time required increases exponentially with the system size [14].

Most attacks on this cryptosystem by an unauthorised user will face the same
difficulty: without explicit knowledge of the current plaintext and/or the
decomposition of G into the matrices A, B and D, it will require an exponentially long time
to decipher a specific ciphertext. We investigated attacks of several types, some of
which appear in [13], concluding that the suggested system is secure.

A brief comparison of our method and the leading technique of RSA [15] shows
that: 1) RSA decryption takes $O(N^3)$ operations while our method only requires
$O(N \log N)$ operations. 2) Encryption costs are of $O(N^2)$ (as in RSA); inverting
the matrices B and D is carried out only once and is of $O(N^3)$. Two drawbacks of
our method: 1) The public key is a dense matrix of dimensionality MxN. However, 
as public key transmission is carried out only once we do not expect it to be of 
great significance. 2) The ciphertext/plaintext bit ratio is greater than one (as 
is the case in RSA). Choosing the N/M ratio is in the hands of the user and is 
related to the security level required. In addition, the increased transmission time 
is compensated by a very fast decryption and the added robustness against noise. 

We discussed the relation between Ising models, certain error-correcting codes
and public-key cryptosystems. Important aspects that are yet to be investigated
include the relation between our results and the bounds obtained in the
information theory literature for error-correcting codes, finite size effects and methods for
alleviating the drawbacks of the new cryptosystem.